ABSTRACT

This paper is based on the multidimensional scaling
technique of Joseph B. Kruskal. It comprises three
parts: the first part describes Kruskal's objectives and
introduces his goodness of fit measure, called stress; the
second part discusses some problems associated with Kruskal's
technique, focusing on the concept of stress; in the third
part, an alternate goodness of fit measure, called V, is
proposed, together with a different procedure for doing
multidimensional scaling. Part three also includes a discussion
of the superiority of V over stress as a goodness of fit
measure.

I. MULTIDIMENSIONAL SCALING

Like all statistical techniques, multidimensional scaling 
is a method of summarizing and drawing inferences from a large 
body of data. In this case, the data are the judgments made 
by a respondent about the similarities or differences between 
stimuli presented in pairs. For N stimuli, multidimensional 
scaling attempts to find N points in a t dimensional mapping 
whose interpoint distances (N(N-1)/2 of them in all) somehow 
resemble or match the corresponding N(N-1)/2 similarity- 
dissimilarity judgments made by the respondent. 

The importance of the number t stems from its interpretation
as the number of dimensions on which the respondent
based his judgments. The best method for determining this 
number when the investigator is using the multidimensional 
scaling techniques to be discussed in this paper has been 
given by Joseph B. Kruskal. (Kruskal, 1964a) His method 
assumes the capability to derive a mapping for any number of 
dimensions (one, two, three or more) and then involves a 
comparison of these mappings of different dimensionality. 
Since the question of how to derive a mapping for an arbitrary 
number of dimensions is the main topic of this paper, the 
dimensionality of the mapping which multidimensional scaling 
seeks to derive will be two throughout this paper. The 
techniques for deriving a mapping are the same whether the 
dimensionality is one, two, three or more. Also, the mapping 
will always be in Euclidean space. The contents of this paper 
can be adapted with very little trouble, however, to
non-Euclidean spaces based on a city-block metric or a Minkowski
r metric. (Kruskal, 1964a)








The discussion can be simplified by the use of an example. 
Suppose one is interested in identifying the dimensions of 
appeal of political candidates. What factors make some 
candidates attractive to a respondent and other candidates 
unattractive? For simplicity, suppose the investigator 
examines the feelings of one respondent with respect to four 
political candidates. Multidimensional scaling would help 
the investigator determine these factors or dimensions of 
appeal by providing him with a t (two in this case) dimensional
mapping of the candidates. The mapping would be
based on judgments made by the respondent about the similar- 
ities or differences between the candidates presented in pairs. 

One method of eliciting the judgments of a respondent 
concerning the similarities or differences between candidates 
presented in pairs is to administer a simple questionnaire to 
him. A typical item in such a questionnaire might resemble 
the following: 

Please specify how similar or how different these
two individuals are in their general appeal to you by
circling one of the numbers, 1 through 9. If you
circle number 1, it implies that they are exactly
equal in their general appeal to you, while if you
circle number 9, it implies that they are extremely
different in their general appeal to you.

                           Exactly                Extremely
                           Equal                  Different
1. Lyndon B. Johnson          1  2  3  4  5  6  7  8  9
   Hubert H. Humphrey

If the respondent's feelings toward four candidates were to
be examined, he would be asked the same question about 5 other
pairs of candidates, making a total of 6 questions in all
(4(3)/2).





The basic premise underlying the analysis of data from a
questionnaire of this kind is that the numbers circled are
measures of psychological distance, closeness or proximity
between stimuli for the respondent. Shepard calls them
proximity measures. (Shepard, 1962a) Here, however, they will
be called psychological distances. These psychological
distances will be labeled $\delta_{ij}$'s, with the i referring to one
stimulus and the j referring to the other. The investigator
only obtains N(N-1)/2 judgments from the respondent since $\delta_{ij}$
equals $\delta_{ji}$ by assumption, and a special experimental design
is required if $\delta_{ii}$ is to have any meaning. (If the assumption
were dropped and the special design employed, the method of
analysis would not change.) The formula N(N-1)/2 can be
obtained by counting the elements in the lower triangular
portion of an N by N matrix or by using the formula for the
number of combinations of N objects taken two at a time, which
is $\binom{N}{2}$ or N(N-1)/2.
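
The count N(N-1)/2 can be checked by listing the pairs directly. The following is a minimal sketch in Python; the candidate labels other than Johnson and Humphrey are placeholders, since only that one pair is named in the questionnaire item above.

```python
from itertools import combinations

# The last two labels are hypothetical placeholders, not taken from the paper.
candidates = ["Johnson", "Humphrey", "Candidate C", "Candidate D"]
n = len(candidates)

# Each unordered pair (i, j) with i < j corresponds to one questionnaire item.
pairs = list(combinations(range(1, n + 1), 2))
expected_count = n * (n - 1) // 2

print(pairs)           # the (i, j) index pairs, one per question
print(expected_count)  # number of judgments requested from the respondent
```

For four candidates the list has one entry for each question the respondent is asked.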

A number of computer-based procedures for doing multi- 
dimensional scaling are currently available. (Shepard 1962, 
Kruskal 1964, Lingoes 1965) However, the discussion in this 
paper will be limited to the most popular of these, the pro- 
cedure proposed by Joseph B. Kruskal in 1964. In addition to 
being the most widely used, Kruskal's technique is the best 
vehicle for the introduction of a slightly different technique 
in this paper. For the most part, Kruskal's notation will be
used in the analysis to follow.





Since the properties of the $\delta_{ij}$'s will become important
later, it should be noted that they are measurements on an
ordinal scale. In other words, the investigator can say that
a $\delta_{ij}$ of 8 is greater than one of 5; however, the difference
between the two (for example, 3) is not meaningful. The latter
property accompanies both linear interval and ratio scales,
but not an ordinal scale. To obtain interval proximity
measures or psychological distances ($\delta_{ij}$'s), one would need
an experimental model somewhat different from the one outlined
by Kruskal and used in this paper. For example, interval
measures can be obtained by the "method of multidimensional
rank order," the "method of complete triads," or a number of
other methods. (Torgerson 1958) All of these methods are
based on the law of comparative judgment. It should be noted,
however, that even the law of comparative judgment does not
yield $\delta_{ij}$'s that are measurements on a ratio scale, a point
that will become important later. (Thurstone 1920)

As mentioned earlier, the investigator has obtained
N(N-1)/2 distance judgments from the respondent. Let M equal
N(N-1)/2. These psychological distances, the $\delta_{ij}$'s, have a
certain rank order:

$$\delta_{i_1 j_1} < \delta_{i_2 j_2} < \cdots < \delta_{i_M j_M}$$

For example, a respondent might provide the following answers
to a four candidate questionnaire:

$\delta_{12}=6$    $\delta_{14}=9$    $\delta_{34}=1$

$\delta_{13}=8$    $\delta_{24}=7$    $\delta_{23}=2$

This would mean that

$$\delta_{34} < \delta_{23} < \delta_{12} < \delta_{24} < \delta_{13} < \delta_{14}$$

Multidimensional scaling seeks to obtain a two (or t)
dimensional mapping, called a configuration, of the stimuli
for which the Euclidean (or non-Euclidean, if they are desired)
distances between the stimuli have the same rank order as the
psychological distances, or $\delta_{ij}$'s. This is the isomorphism
which multidimensional scaling seeks to create between the
psychological distances or proximity measures and the
interpoint distances in a Euclidean mapping. Let $X_i$ be a two
dimensional vector, with components $x_{i1}$ and $x_{i2}$, referring to
the ith political candidate's position in the two dimensional
mapping in Euclidean space. The Euclidean distance between the
two candidates, i and j, is the square root of the sum of
squares of the distances along each axis, or by the
Pythagorean theorem,

$$d_{ij} = \sqrt{(x_{i1}-x_{j1})^2 + (x_{i2}-x_{j2})^2}$$
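
As an illustration of the distance formula, the sketch below computes all interpoint distances for a small configuration. The coordinates and the function name `interpoint_distances` are assumptions chosen for the example; they do not come from the paper.

```python
import math
from itertools import combinations

def interpoint_distances(points):
    """Return {(i, j): d_ij} for 1-based stimulus indices i < j."""
    return {
        (i + 1, j + 1): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }

# Hypothetical two dimensional configuration of four candidates.
configuration = [(0.0, 0.0), (3.0, 4.0), (3.0, 0.0), (0.0, 4.0)]
distances = interpoint_distances(configuration)

for pair, d in sorted(distances.items()):
    print(pair, d)
```

The dictionary holds one Euclidean distance per pair of stimuli, which is the quantity whose rank order is compared with that of the psychological distances.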

In the four candidate example, the investigator would
want to find a two dimensional mapping of the candidates for
which $d_{34}<d_{23}<d_{12}<d_{24}<d_{13}<d_{14}$. The only fixed characteristics
of the mapping are the relationships between the $d_{ij}$'s. The
axes can be rotated in any direction and the origin placed
anywhere. Kruskal places the origin at the centroid of the
configuration and normalizes the configuration by making the
sum of the squared distances of the points from the origin
a constant. Finally, he "normalizes the angular attitude of
the configuration by rotating it so that its so-called
principal axes coincide with the coordinate axes (in the
natural order)."¹ The principal axes rotation is very important
in the achievement of a solution for a different
multidimensional scaling technique, that of Roger N. Shepard.²
However, it is not important for solution purposes in the
Kruskal technique, although it might help the investigator
in the interpretation of his results.

¹Kruskal, Joseph B., "Nonmetric Multidimensional Scaling:
A Numerical Method," Psychometrika, v. 29, p. 120, June 1964.

²Shepard, Roger N., "The Analysis of Proximities:
Multidimensional Scaling With an Unknown Distance Function,"
Psychometrika, v. 27, p. 132, June 1962.

Of course not all configurations of the points (particular
mappings of the candidates) will yield $d_{ij}$'s that have the
same rank order as the $\delta_{ij}$'s. Consequently, what the
investigator needs and what Kruskal provides is an index to determine
how close a given configuration comes to satisfying the rank
order requirements which the $\delta_{ij}$'s place on the $d_{ij}$'s. This
index is called stress.

Prior to defining stress, Kruskal introduces a new set
of symbols, called $\hat{d}_{ij}$'s. The $\hat{d}_{ij}$'s are numbers which
completely satisfy the rank order requirements given by the $\delta_{ij}$'s.
If the $d_{ij}$'s themselves satisfy these requirements, then the
set of $\hat{d}_{ij}$'s could be, and in fact will be, identical to the
set of $d_{ij}$'s. However, consider the following situation. The
$\delta_{ij}$'s are in the order specified in the example used earlier,

$$\delta_{34} < \delta_{23} < \delta_{12} < \delta_{24} < \delta_{13} < \delta_{14}$$

and the mapping that has been obtained has the following
$d_{ij}$'s:

$d_{34}=2$    $d_{12}=3$    $d_{13}=7$

$d_{23}=1$    $d_{24}=4$    $d_{14}=6$

The rank order of the $d_{ij}$'s is the following:

$$d_{23} < d_{34} < d_{12} < d_{24} < d_{14} < d_{13}$$

A set of numbers, $\hat{d}_{ij}$'s, that satisfy the rank order constraints
set by the $\delta_{ij}$'s can be obtained in the following way:

$\hat{d}_{34}=\hat{d}_{23}=(d_{34}+d_{23})/2=1.5$    $\hat{d}_{24}=d_{24}$

$\hat{d}_{12}=d_{12}$    $\hat{d}_{13}=\hat{d}_{14}=(d_{13}+d_{14})/2=6.5$,

so that $\hat{d}_{34} \le \hat{d}_{23} \le \hat{d}_{12} \le \hat{d}_{24} \le \hat{d}_{13} \le \hat{d}_{14}$.


This example demonstrates that the $\hat{d}_{ij}$'s are based on
averages of certain $d_{ij}$'s. In the example, so-called "equality
blocks" (for lack of a better name) were created for $\hat{d}_{34}$ and
$\hat{d}_{23}$ and for $\hat{d}_{13}$ and $\hat{d}_{14}$ by averaging $d_{34}$ and $d_{23}$ to find
$\hat{d}_{34}$ ($=\hat{d}_{23}$) and averaging $d_{13}$ and $d_{14}$ to find $\hat{d}_{13}$ ($=\hat{d}_{14}$). The
method of calculation of $\hat{d}_{ij}$'s for every situation is part of
a technique called "monotone regression." (Miles 1959)
Monotone regression is not discussed in any detail in this
paper. However, one of its properties is that the differences
between the $d_{ij}$'s and the $\hat{d}_{ij}$'s computed in the example
represent the minimum differences between the distances, the $d_{ij}$'s,
and any set of numbers satisfying the rank ordering specified
by the $\delta_{ij}$'s.
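
The averaging in the example can be reproduced with a short routine. The sketch below uses the pool-adjacent-violators procedure, which is one standard way of carrying out monotone regression; it is offered only as an illustration and is not claimed to be the procedure of Miles (1959) or of Kruskal's program. The function name and the convention of passing the distances in order of increasing psychological distance are assumptions made for the example.

```python
def monotone_regression(d_in_delta_order):
    """Fit non-decreasing values to distances listed by increasing delta.

    Adjacent values that violate the required order are pooled into a
    block and replaced by the block average.
    """
    blocks = []  # each block is [sum of distances, number of distances]
    for value in d_in_delta_order:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

# Distances from the example, in the order d34, d23, d12, d24, d13, d14.
d_values = [2, 1, 3, 4, 7, 6]
d_hat = monotone_regression(d_values)
print(d_hat)
```

For the example mapping this pools $d_{34}$ with $d_{23}$ and $d_{13}$ with $d_{14}$, as in the text.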

In the above paragraph the point was made that if the
$d_{ij}$'s do not satisfy the rank order constraints, the $\hat{d}_{ij}$'s
will be averages of certain $d_{ij}$'s, as seen in the example.
If the problem has M distances, then it can be shown that
there are $2^{M-1}-1$ possible ways to average the $d_{ij}$'s to obtain
$\hat{d}_{ij}$'s; or, if the case under which each $\hat{d}_{ij}$ equals its
respective $d_{ij}$ is considered to be a degenerate type of
averaging, then $2^{M-1}$ possible ways exist.³

³The proof of this statement is a lengthy one that must
be performed inductively. Since the number $2^{M-1}$ is not crucial
to this analysis, the proof will not be given here.

Another example may help. Suppose the investigator is
dealing with three stimuli and consequently with three
distances: $d_{12}$, $d_{13}$, $d_{23}$. The psychological distances are in
the following order: $\delta_{12}<\delta_{13}<\delta_{23}$. There are $2^{M-1}=2^2=4$
different ways to average $d_{ij}$'s to obtain $\hat{d}_{ij}$'s. First of all,
each $\hat{d}_{ij}$ may be equal to its respective $d_{ij}$, or

(1)  $\hat{d}_{12}=d_{12}$,  $\hat{d}_{13}=d_{13}$,  $\hat{d}_{23}=d_{23}$.

Another possibility is that

(2)  $\hat{d}_{12}=\hat{d}_{13}=(d_{12}+d_{13})/2$,  $\hat{d}_{23}=d_{23}$.

A third is that

(3)  $\hat{d}_{12}=d_{12}$,  $\hat{d}_{13}=\hat{d}_{23}=(d_{13}+d_{23})/2$.

The final possibility is that

(4)  $\hat{d}_{12}=\hat{d}_{13}=\hat{d}_{23}=(d_{12}+d_{13}+d_{23})/3$.

Monotone regression would lead to one of the four
specifications, depending on the order of the $d_{ij}$'s obtained from a
particular mapping. For example, given that $\delta_{12}<\delta_{13}<\delta_{23}$,
the second specification would be appropriate if
$d_{13}<d_{12}<d_{23}$.

Each of the four specifications will be called a block
equality system. In the fourth specification, the block
equality is $\hat{d}_{12}=\hat{d}_{13}=\hat{d}_{23}$, by definition. In the third, $\hat{d}_{13}$
equals $\hat{d}_{23}$ by definition, while in the second specification,
$\hat{d}_{12}$ equals $\hat{d}_{13}$ by definition. There are no defined equalities
in the first specification.
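
The four block equality systems can also be listed mechanically. The sketch below assumes, as in the examples above, that a block always consists of distances that are consecutive in the rank order, so that each of the M-1 gaps between adjacent distances either is or is not a block boundary. The function name is hypothetical.

```python
from itertools import product

def block_equality_systems(m):
    """Yield each way of cutting m rank-ordered distances into consecutive blocks.

    Distances are identified by their position 0..m-1 in the rank order.
    """
    for cuts in product([False, True], repeat=m - 1):
        blocks, current = [], [0]
        for position, cut in enumerate(cuts, start=1):
            if cut:
                blocks.append(current)   # close the block at this boundary
                current = [position]
            else:
                current.append(position)  # extend the current equality block
        blocks.append(current)
        yield blocks

systems_for_three = list(block_equality_systems(3))
for system in systems_for_three:
    print(system)
```

With positions 0, 1, 2 standing for $d_{12}$, $d_{13}$, $d_{23}$, the systems printed correspond to specifications (4), (2), (3) and (1) above, and the count doubles with each additional distance.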

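Now that the method of obtaining the $\hat{d}_{ij}$'s from the $d_{ij}$'s
has been outlined and the concept of a block equality system
as a defined equality between $\hat{d}_{ij}$'s has been introduced,
stress can be defined:

$$\text{Stress} = \left[ \frac{\sum_{i<j} (d_{ij}-\hat{d}_{ij})^2}{\sum_{i<j} d_{ij}^2} \right]^{1/2}$$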
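
A minimal sketch of the stress computation follows, taking stress to be the square root of the sum of squared differences between the $d_{ij}$'s and $\hat{d}_{ij}$'s divided by the sum of the squared $d_{ij}$'s. The function name is an assumption, and the numbers are those of the four candidate example given earlier.

```python
import math

def stress(d, d_hat):
    """Kruskal's stress for matching lists of distances and fitted values."""
    numerator = sum((x - y) ** 2 for x, y in zip(d, d_hat))
    denominator = sum(x ** 2 for x in d)
    return math.sqrt(numerator / denominator)

# Order: d34, d23, d12, d24, d13, d14 (increasing psychological distance).
d_values = [2, 1, 3, 4, 7, 6]
d_hat_values = [1.5, 1.5, 3, 4, 6.5, 6.5]
example_stress = stress(d_values, d_hat_values)
print(example_stress)
```

Here the numerator is 4(0.5)^2 = 1 and the denominator is 115, so the value printed is the square root of 1/115.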





The heart of Kruskal's technique is the derivation of
the points (the X's) in the mapping, and subsequently their
distances. Nonlinear programming becomes relevant at this
point since the problem is to find the points and their
distances that do the following:

Minimize Stress

Subject to:

$$\hat{d}_{i_1 j_1} \le \hat{d}_{i_2 j_2} \le \cdots \le \hat{d}_{i_M j_M}$$

Kruskal employs the "method of steepest descent" to
solve this problem. (Kruskal 1964b) His use of this method
implies that he is treating the minimization as an
unconstrained one, since this method is generally employed in
unconstrained minimization problems. (Spang 1962) As Spang
points out, the use of the "method of steepest descent" for
constrained minimization problems, which is in fact what
Kruskal does, requires the construction of a Lagrangian and
then the unconstrained minimization of the Lagrangian. Kruskal
uses the "method of steepest descent" but mentions neither
Lagrangians nor the convexity assumptions that are normally
made when minimizing a Lagrangian. It should also be noted, in
passing, that the formula for the gradient on page 125
(Kruskal 1964) is incorrect since it fails to take into
account the fact that the $\hat{d}_{ij}$'s change as the $d_{ij}$'s change.

A more conventional nonlinear programming approach to
this problem shows that Kruskal's technique actually derives
a solution to one of $2^{M-1}$ different constrained minimizations
(nonlinear programming problems). There is one nonlinear
programming problem for each different block equality system
or definition of the $\hat{d}_{ij}$'s.⁴ The "method of steepest descent"
leads to the solution to one of these $2^{M-1}$ different problems.
However, the "method" by itself cannot determine if the
solution to another of these $2^{M-1}$ problems (where the $\hat{d}_{ij}$'s are
defined differently) would have a lower stress value than the
one which it has derived. There could be as many as $2^{M-1}-1$
other minima, under different block equality systems, that are
smaller than the one yielded by the "method of steepest
descent." (Of course, there could also be none.) A different
approach to the problem that would bypass the above difficulty
might be possible.⁵ However, as will be shown in the next part,
in most cases, stress is not a good index of goodness of fit in
the first place. Even if a better minimization technique
were possible, it would have no effect upon the suitability
of stress as a measure of goodness of fit. Consequently, the
next section will be devoted to a discussion of some of the
problems inherent in the concept of stress.

⁴Define, a priori, the relationship between the $\hat{d}_{ij}$'s
and the $d_{ij}$'s; then stress becomes a function of the $d_{ij}$'s
alone (the X's or points in the mapping, ultimately). The
constraints represented by $\hat{d}_{i_1 j_1} \le \hat{d}_{i_2 j_2} \le \cdots \le \hat{d}_{i_M j_M}$ are
really constraints upon the $d_{ij}$'s, since the relationship between
the $\hat{d}_{ij}$'s and the $d_{ij}$'s has been specified beforehand. The
problem of minimizing stress, then, has been transformed into a
conventional nonlinear programming problem. However, there
are $2^{M-1}$ possible relationships between the $\hat{d}_{ij}$'s and the
$d_{ij}$'s (block equality systems) and, consequently, $2^{M-1}$ nonlinear
programming problems. Of course, some problems may not have
solutions, since in some cases the constraints may imply a
feasible region that is the null set. Spang (1962) contains a
discussion of many techniques that could be employed to solve
these constrained minimizations. Since the publication of that
article, other techniques have been developed that might prove
helpful. (Klingman 1963, Glass and Cooper 1965, Box 1965)

⁵There appears to be a certain ordering to the $2^{M-1}$
different problems in the sense that stress is always lower
when the $\hat{d}_{ij}$'s are defined in one way than when they are
defined in another way. For example, stress is always lower
when each $\hat{d}_{ij}$ is defined to be equal to its respective $d_{ij}$
than when the $\hat{d}_{ij}$'s are defined in any other way. If this
ordering could be determined, then the first nonlinear
programming problem that had a feasible solution would be the
one having the lowest minimum. The ordering may only be a
partial one, however, which would complicate things
considerably.





II. STRESS AS A MEASURE OF GOODNESS OF FIT

A number of problems with Kruskal's goodness of fit
measure, stress, become evident upon closer examination. One
of these is the question of the meaning of stress, which, in
turn, is related to the problem of the specification of both
minimum and maximum possible values that the index can attain.
Another problem is the question of whether or not stress,
which will be shown to be dependent on ratios between the
$d_{ij}$'s, is an appropriate measure of the goodness of fit of
the rank order of the $d_{ij}$'s to the rank order dictated by
the $\delta_{ij}$'s. These problems of interpretation, maximum and
minimum possible values and appropriateness of stress will be
discussed in that order in this second part.

The meaning of stress is the first question to be raised
in this part. The square of stress would appear to lend itself
to interpretation as the percentage of the variance of the
$d_{ij}$'s not conforming to the monotonic (rank order) requirements
set by the $\delta_{ij}$'s. Under this interpretation, stress itself
(not stress squared) would then be the square root of this
number or the percentage of the standard deviation of the $d_{ij}$'s
not accounted for by the monotonic requirements.

This interpretation of stress encounters problems as
soon as one examines the maximum and minimum possible values
of stress. Intuitively, the minimum should be 0.0 and the
maximum 1.0. Intuition is only partially correct in this case.

Obviously, if the $d_{ij}$'s perfectly satisfied the monotonic
requirements, then stress would be 0.0, since each $\hat{d}_{ij}$ would
equal its respective $d_{ij}$, as discussed earlier. However,
the conditions under which stress would be equal to 1.0,
which would be the logical maximum under the "percentage of
variance" interpretation, are not clearly defined.

Stress would be 1.0 if all $\hat{d}_{ij}$'s were zero. However,
since the $d_{ij}$'s by definition cannot be less than zero and
are not all allowed to equal zero at the same time, a degenerate
solution which Kruskal disallows,⁶ all the $\hat{d}_{ij}$'s cannot equal
zero. Similar problems arise if one tries to make each $\hat{d}_{ij}$
equal to twice its respective $d_{ij}$, the other condition under
which stress would equal 1.0.

⁶Kruskal, op. cit., p. 120.

The following is a short proof that in a particular
problem it is impossible for stress to equal 1.0. (Under the
"percentage of variance" interpretation, it should always be
possible for stress to equal 1.0.) Once again, suppose that
an investigator is using only three stimuli and the specified
rank ordering of the distances is as before:

$$\delta_{12} < \delta_{13} < \delta_{23}$$

As mentioned earlier, there are four possible block
equality systems for this problem:

(1)  $\hat{d}_{12}=d_{12}$,  $\hat{d}_{13}=d_{13}$,  $\hat{d}_{23}=d_{23}$

(2)  $\hat{d}_{12}=\hat{d}_{13}=(d_{12}+d_{13})/2$,  $\hat{d}_{23}=d_{23}$

(3)  $\hat{d}_{12}=d_{12}$,  $\hat{d}_{13}=\hat{d}_{23}=(d_{13}+d_{23})/2$

(4)  $\hat{d}_{12}=\hat{d}_{13}=\hat{d}_{23}=(d_{12}+d_{13}+d_{23})/3$

Stress will then equal

$$\left[ \frac{(d_{12}-\hat{d}_{12})^2+(d_{13}-\hat{d}_{13})^2+(d_{23}-\hat{d}_{23})^2}{d_{12}^2+d_{13}^2+d_{23}^2} \right]^{1/2}$$

If the first block equality system is used and a mapping that
allows $d_{12}<d_{13}<d_{23}$ is obtained, stress would equal 0.0.
Consequently, if stress were to equal 1.0, it would do so under
one of the other three block equality systems. Assume that
stress can equal 1.0 under block equality system number two.
Then the following relationships must hold:

$$1 = \frac{\left(d_{12}-\frac{d_{12}+d_{13}}{2}\right)^2+\left(d_{13}-\frac{d_{12}+d_{13}}{2}\right)^2+(d_{23}-d_{23})^2}{d_{12}^2+d_{13}^2+d_{23}^2}$$

$$\tfrac{1}{4}\left[(d_{12}-d_{13})^2+(d_{13}-d_{12})^2\right] = d_{12}^2+d_{13}^2+d_{23}^2$$

$$2d_{12}^2-4d_{12}d_{13}+2d_{13}^2 = 4d_{12}^2+4d_{13}^2+4d_{23}^2$$

$$-4d_{12}d_{13} = 2d_{12}^2+2d_{13}^2+4d_{23}^2$$

The last relationship is an obvious contradiction since
the right side of the equation must be greater than the left
side ($d_{ij} \ge 0$) unless all $d_{ij}$'s are equal to zero (which is not
allowed). The same type of contradiction arises when block
equality system number three is employed. Under block equality
system number four, the following "illegal" statement is
obtained:

$$-2d_{12}d_{13}-2d_{12}d_{23}-2d_{13}d_{23} = d_{12}^2+d_{13}^2+d_{23}^2$$

The lack of a clearly defined maximum of 1.0 for stress
makes a "percentage of variance" interpretation difficult at
best. However, these problems are not nearly as serious as
those associated with the question of the appropriateness of
stress as a measure of goodness of fit. This question is very
closely related to the discussion in the first part about
levels of measurement. The reader may want to refer to that
section at this time.

Euclidean distances are numbers on a ratio scale. Both
the differences between two distances and the ratios between
two distances are meaningful. As mentioned earlier, the
proximity measures or psychological distances will normally
be measurements on an ordinal scale, although they may be
measurements on a linear interval scale if the law of
comparative judgment is invoked. Whether the psychological
distances are ordinal or interval measurements, the ratios
between two $\delta_{ij}$'s are not meaningful.

The major problem with stress as a measure of goodness
of fit of the $d_{ij}$'s to the $\delta_{ij}$'s is that it depends on the
ratio properties of the $d_{ij}$'s. The following example will
demonstrate that stress can be reduced to a function of the
ratios between $d_{ij}$'s.

The same three candidate example will be used. Assume,
once again, that the respondent has specified that
$\delta_{12}<\delta_{13}<\delta_{23}$. Further, suppose a mapping which has the
following distance relationships has been obtained:

$d_{13}<d_{12}$

$d_{12}<d_{23}$

$d_{13}<d_{23}$

In order to ensure that $\hat{d}_{12} \le \hat{d}_{13} \le \hat{d}_{23}$, block equality
system number two must be used, or:

$\hat{d}_{12}=\hat{d}_{13}=(d_{12}+d_{13})/2$

$\hat{d}_{23}=d_{23}$.

Under these conditions,

$$\text{Stress} = \left[ \frac{\left(d_{12}-\frac{d_{12}+d_{13}}{2}\right)^2+\left(d_{13}-\frac{d_{12}+d_{13}}{2}\right)^2}{d_{12}^2+d_{13}^2+d_{23}^2} \right]^{1/2}$$

Or,

$$\text{Stress} = \left[ \frac{d_{12}^2/2-d_{12}d_{13}+d_{13}^2/2}{d_{12}^2+d_{13}^2+d_{23}^2} \right]^{1/2}$$

Let $d_{13}/d_{12}$ equal K and let $d_{23}/d_{12}$ equal G. Then,

$$\text{Stress} = \left[ \frac{K^2-2K+1}{2+2K^2+2G^2} \right]^{1/2}$$

or stress is entirely dependent on the ratios $d_{23}/d_{12}$ and
$d_{13}/d_{12}$.
The problem is obvious. If the investigator is using
ordinal $\delta_{ij}$'s, stress is supposedly a measure of how well the
$d_{ij}$'s fit the rank ordering specified by the respondent, that
is, the rank ordering of the $\delta_{ij}$'s. This would seem to indicate
that it should not be dependent on a property which the original
data do not possess, that is, the property that the ratios of
the distances are meaningful. Notice the effect of $G^2$ in the
above equation. Stress decreases as $G^2$ increases. (Recall
that G equals $d_{23}/d_{12}$.) The respondent might have specified
that $\delta_{23}=6$ and $\delta_{12}=5$. Why should one obtain a lower stress
value when $d_{23}=1000$ and $d_{12}=2$ than when $d_{23}=3$ and $d_{12}=2$?
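
The question can be made concrete with a small calculation under block equality system number two. In the sketch below the value $d_{13}=1$ is an assumption added so that $d_{13}<d_{12}$ holds; the values $d_{12}=2$ and $d_{23}=3$ or 1000 are those mentioned in the text. The function names are hypothetical.

```python
import math

def stress_block_two(d12, d13, d23):
    """Stress when d12 and d13 are pooled and d23 is left as it is."""
    pooled = (d12 + d13) / 2
    numerator = (d12 - pooled) ** 2 + (d13 - pooled) ** 2
    return math.sqrt(numerator / (d12 ** 2 + d13 ** 2 + d23 ** 2))

def stress_from_ratios(k, g):
    """The same quantity written in terms of K = d13/d12 and G = d23/d12."""
    return math.sqrt((k ** 2 - 2 * k + 1) / (2 + 2 * k ** 2 + 2 * g ** 2))

near = stress_block_two(2, 1, 3)       # d23 = 3
far = stress_block_two(2, 1, 1000)     # d23 = 1000, same rank order
rescaled = stress_block_two(4, 2, 6)   # every distance doubled

print(near, far, rescaled)
print(stress_from_ratios(1 / 2, 3 / 2))
```

Both mappings contain the same single rank order violation, yet the second has a much smaller stress, and doubling every distance leaves stress unchanged because only the ratios K and G enter.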

When the $\delta_{ij}$'s are interval measurements, that is when
the law of comparative judgment has been invoked, the same
problem arises. Why should the measure of goodness of fit be
dependent on the ratios between the $d_{ij}$'s when the ratios
between the $\delta_{ij}$'s are not meaningful?


J 
An example of what can happen when stress is used as a
measure of goodness of fit may help. Suppose an investigator
wants to examine a 4 candidate situation and the respondent
specifies that $\delta_{34}<\delta_{23}<\delta_{12}<\delta_{24}<\delta_{13}<\delta_{14}$. First, suppose he has
obtained a mapping with the following $d_{ij}$'s,


d,,=l1 d 


34 om 


d53=2 d.,,=4 d,,=8 


24 







: ’ ; | 2 
Stress in this situation equals [oq . Now suppose 


he obtains another mapping with the following set of $d_{ij}$'s:

$d_{34}=2$    $d_{12}=3$    $d_{13}=7$

$d_{23}=1$    $d_{24}=4$    $d_{14}=6$

For this second mapping, stress is equal to $\left[1/115\right]^{1/2}$.

Notice that the first mapping violated the rank ordering
specified by the $\delta_{ij}$'s only once while the second mapping
violated the rank ordering twice. Yet the first had a higher
stress value than the second.

In the next part, a new index of goodness of fit will be
proposed for the case of ordinal $\delta_{ij}$'s and a very simple
method of dealing with interval $\delta_{ij}$'s will be discussed.

III. HOW TO FIT EUCLIDEAN DISTANCES TO
PSYCHOLOGICAL DISTANCES


In this part a new index of goodness of fit, V, will be
introduced. This index is sensitive neither to the size of
the difference between two $d_{ij}$'s nor to the size of the ratio
between two $d_{ij}$'s. It also has a well-defined maximum of 1.0
and a well-defined minimum of 0.0. This simple index is
based on the number of violations of the order relations
specified by the $\delta_{ij}$'s. A very simple method of dealing
with interval $\delta_{ij}$'s will also be discussed in this section.


Earlier, the point was made that if one were attempting
to map N stimuli into Euclidean space, then there would be
N(N-1)/2 psychological distances, $\delta_{ij}$'s, associated with
these stimuli, and likewise N(N-1)/2 Euclidean distances,
$d_{ij}$'s, associated with the mapping. Again, let M=N(N-1)/2.
The respondent, by his answers, specifies a rank ordering for
the $\delta_{ij}$'s:

$$\delta_{i_1 j_1} < \delta_{i_2 j_2} < \cdots < \delta_{i_M j_M}$$

The problem of multidimensional scaling is to find a
mapping of the N stimuli. The Euclidean distances between
the points (stimuli) in the mapping should have, as nearly as
possible, the same rank order as that of the psychological
distances. Implicit in this rank order are M(M-1)/2 constraints:

$$d_{i_1 j_1} < d_{i_2 j_2}, \quad d_{i_1 j_1} < d_{i_3 j_3}, \quad \ldots, \quad d_{i_{M-1} j_{M-1}} < d_{i_M j_M}$$


For any particular mapping, then, a possible index of
the goodness of fit of the mapping to the rank order constraints
would be the number of violations of the constraints by the
$d_{ij}$'s. In fact, this is the index that will be adopted, except
for one obvious alteration. The number of violations of the
constraints by the $d_{ij}$'s should be expressed as a percentage
of the maximum possible number of violations, or in other
words,

$$V = \frac{A}{M(M-1)/2}$$

where A is the actual number of violations and M(M-1)/2 is
the maximum possible number of violations. If V equals 0.0,
no violations occur and the mapping perfectly satisfies the
rank order constraints. If V equals 1.0, the mapping perfectly
violates the rank order constraints.
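
A minimal sketch of the computation of V follows. It assumes that the distances are supplied in order of increasing psychological distance and that no two distances are tied; tied values are simply not counted as violations here. The function name is hypothetical.

```python
from itertools import combinations

def violation_index(d_in_delta_order):
    """V = A / (M(M-1)/2), with distances listed by increasing delta.

    A counts the pairs of distances whose order is the reverse of the
    order required by the psychological distances.
    """
    m = len(d_in_delta_order)
    violations = sum(
        1
        for a, b in combinations(range(m), 2)
        if d_in_delta_order[a] > d_in_delta_order[b]
    )
    return violations / (m * (m - 1) / 2)

# Order: d34, d23, d12, d24, d13, d14, as in the four candidate example.
v_example = violation_index([2, 1, 3, 4, 7, 6])
print(v_example)
```

For the example mapping, two of the fifteen constraints are violated, so V equals 2/15.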

If this new index were adopted, the problem of multi- 
dimensional scaling would become the problem of finding a 
configuration that minimizes V. This minimization is very
similar to a problem encountered in mathematical programming,
the derivation of an initial feasible point. (Klingman 1963, 
Hilleary 1966, Rosen 1961) A very popular technique for 
finding a feasible point is Hooke and Jeeves' direct search 
algorithm for unconstrained functions. (Hilleary 1966, 
Klingman 1963, Hooke and Jeeves 1961) 

The above references contain complete descriptions of 
the direct search algorithm. In order to use the algorithm 
to minimize V, one would start with an arbitrary configuration 
of the N points in t dimensions. The t coordinate values for 
each point would be the independent variables, making Nt 
independent variables in all. A univariate search is first
performed, with each independent variable being changed by a
small amount, one at a time, in order to determine the
direction toward the minimum. If this exploratory move succeeds
in lowering the objective function, a "pattern move" is then
attempted. A pattern move is a move based on the directions
of the last two (sometimes more than two) exploratory moves.
Various modifications of the algorithm tend to differ with 
respect to the weights given to previously successful explora- 
tory moves. If the pattern move does not succeed in lowering 
the objective function, another exploratory move is attempted. 
Eventually, the exploratory move will be unable to lower the 
objective function. In that case, the step size of the search 
is reduced and another search is performed. The process is 
repeated until the step size reaches a predetermined minimum. 
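
The search just described can be sketched as follows. This is a simplified variant written for illustration, not a transcription of the algorithm in the cited references, which differ in their details. The function names, the starting configuration, the step sizes and the treatment of coincident points (V is set to 1.0, a penalty of the kind suggested for zero distances) are all assumptions made for the example.

```python
import math
from itertools import combinations

def violation_index(coords, pair_order, dims=2):
    """V for a flat coordinate list; pair_order lists pairs by increasing delta."""
    points = [coords[i:i + dims] for i in range(0, len(coords), dims)]
    d = [math.dist(points[i], points[j]) for i, j in pair_order]
    if min(d) == 0.0:
        return 1.0  # penalty: configurations with a zero distance are rejected
    m = len(d)
    bad = sum(1 for a, b in combinations(range(m), 2) if d[a] > d[b])
    return bad / (m * (m - 1) / 2)

def explore(f, base, step):
    """Univariate search: try +step and -step on each variable in turn."""
    point, best = list(base), f(base)
    for k in range(len(point)):
        for delta in (step, -step):
            trial = list(point)
            trial[k] += delta
            value = f(trial)
            if value < best:
                point, best = trial, value
                break
    return point, best

def hooke_jeeves(f, start, step=1.0, min_step=1e-3):
    base, base_value = list(start), f(start)
    while step >= min_step:
        point, value = explore(f, base, step)
        if value < base_value:
            # Pattern moves: extrapolate along the last successful direction.
            while True:
                pattern = [2 * p - b for p, b in zip(point, base)]
                base, base_value = point, value
                point, value = explore(f, pattern, step)
                if value >= base_value:
                    break
        else:
            step /= 2  # exploration failed, so search on a finer grid
    return base, base_value

# Pairs (0-based) in the order delta34 < delta23 < delta12 < delta24 < delta13 < delta14.
pair_order = [(2, 3), (1, 2), (0, 1), (1, 3), (0, 2), (0, 3)]
start = [0.0, 0.0, 1.0, 0.2, 0.3, 1.1, 1.5, 0.9]  # arbitrary starting configuration

def objective(coords):
    return violation_index(coords, pair_order)

start_v = objective(start)
final_coords, final_v = hooke_jeeves(objective, start)
print(start_v, final_v)
```

Because V takes only finitely many values and every accepted move lowers it, the search must stop; it is not guaranteed to reach V = 0.0 from an arbitrary starting configuration.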

The new index, V, bears a striking similarity to Kendall's
tau, a commonly used rank correlation coefficient. (Kendall
1962) In fact the two indices can be related by the following
equation:

tau = 1.0 - 2V.
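
The relation can be seen in one line if tau is taken in its usual form without ties, that is, the number of constraints satisfied minus the number violated, divided by the number of constraints. The symbol C for the number of satisfied constraints is introduced here for the illustration only.

```latex
% Assumes no ties, so that C + A = M(M-1)/2.
\tau = \frac{C - A}{M(M-1)/2}
     = \frac{\left[M(M-1)/2 - A\right] - A}{M(M-1)/2}
     = 1 - \frac{2A}{M(M-1)/2}
     = 1 - 2V
```

For the four candidate example with two violations out of fifteen constraints, this gives tau = 1 - 4/15 = 11/15.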

Kendall's tau was not adopted as the index of goodness of fit 
for two reasons. First, certain characteristics of V are 
Similar to characteristics possessed by Kruskal's index, stress. 


Pee crample, wa perfect fit of the d..'s to the nonmetric hy- 


3 
pothesis would yield a value of 0.0 for both stress and V. For 
both indices, a low value is interpreted as a good fit while a 


high value is interpreted as a poor fit. Of course, with 


Kendall's tau, a perfect fit would yield a value of 1.0. A 


pt | 





certain amount of consistency among indices of goodness of 

fit seems desirable, and consequently V should be the preferred 
index on this basis. Also, the "percentage of possible vio- 
lations" interpretation of V is intuitively appealing. 

A second reason for adopting V instead of tau is based 
on a disadvantage which both possess, but to a different 
extent. Neither V nor tau iS a continuous variable. V is not 
continuous Since the numerator A, 1S discrete. In any problen, 
the number of violations can be 0,1,2,3 and so forth up to 
M(M-1)/2. The difference between successive values of A is l. 
Kendall's tau has the same denominator as V; however, the 
numerator is different. The difference between successive 
values of the numerator is 2. In other words, V is on a more 
compressed scale than tau. (The formula relating the two 
indices also demonstrates this fact.) Kendall has shown that 
as the denominator in the expression for tau (M(M-1)/2 in this 
ase) becomes llarge, tau approximates a continuous variable.’ 
In fact, he has shown that for a denominator greater than 45 
(M greater than 10), tau can be considered to be a continuous 
variable. The compressed scale of V approaches continuity 
even faster than tau, since the difference between successive 
Values for the numerator rs 1, not 2. 

⁷Kendall, M. G., Rank Correlation Methods, 3rd ed.,
p. 69, Charles Griffin and Company, 1962.

A necessary condition for continuity of the function
relating values of the coordinates of the points in a Euclidean
mapping and V is that V itself be continuous. Consequently,
if the only impediment to the continuity of the function is
the lack of continuity of V (which is the case here), then as
V approaches (at the limit) a continuous variable, the function
will approach a continuous one. Most minimization techniques
require the assumption of continuity of the objective function.
Consequently, the index which allows the function to approach
a continuous one faster (V in this case) should be preferred.⁸
It should be noted that stress is a continuous function.
However, as mentioned earlier, the minimization of stress is a
minimization under constraints. On the other hand, like the
problem of finding a feasible point, the minimization of V is
essentially an unconstrained one.⁹
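
A minimal sketch of how such a minimization might be set up is
given below. It is a simplified pattern search in the spirit of
Hooke and Jeeves, not a tested routine, and it assumes that V
is the proportion of order violations among the M(M-1)/2 pairs
of distances. The penalty for zero distances follows the
footnote to this passage. The step sizes, the data and all
function names are hypothetical. Because V is constant over
small changes in the coordinates, the search may stall on a
plateau.

```python
import math
import random
from itertools import combinations

def v_index(deltas, coords):
    """Assumed V: share of distance pairs ordered against the deltas."""
    d = [math.dist(p, q) for p, q in combinations(coords, 2)]
    if min(d) == 0.0:
        return 1.0  # penalty: a zero distance is treated as the worst fit
    m = len(d)
    bad = sum(1 for a, b in combinations(range(m), 2)
              if (deltas[a] - deltas[b]) * (d[a] - d[b]) < 0)
    return bad / (m * (m - 1) / 2)

def explore(f, x, fx, step):
    """Exploratory move: try +step, then -step, on each coordinate in turn."""
    x = list(x)
    for i in range(len(x)):
        for move in (step, -step):
            trial = list(x)
            trial[i] += move
            ft = f(trial)
            if ft < fx:
                x, fx = trial, ft
                break
    return x, fx

def hooke_jeeves(f, x0, step=0.5, shrink=0.5, tol=1e-3, max_iter=200):
    """Simplified pattern search; returns the best point found and its value."""
    base, fbase = list(x0), f(x0)
    for _ in range(max_iter):
        if step < tol:
            break
        new, fnew = explore(f, base, fbase, step)
        if fnew >= fbase:
            step *= shrink  # no improvement: reduce the step size
            continue
        while fnew < fbase:
            # Pattern move: continue in the direction that just succeeded.
            pattern = [2 * n - b for n, b in zip(new, base)]
            base, fbase = new, fnew
            new, fnew = explore(f, pattern, f(pattern), step)
    return base, fbase

T_DIM = 2                      # dimensionality of the mapping (assumed)
deltas = [1, 2, 3, 4, 5, 6]    # hypothetical ranks for the 6 pairs of 4 stimuli

def objective(flat):
    """V as a function of the flattened coordinate vector."""
    coords = [flat[i:i + T_DIM] for i in range(0, len(flat), T_DIM)]
    return v_index(deltas, coords)

random.seed(0)
x0 = [random.uniform(-1.0, 1.0) for _ in range(4 * T_DIM)]
v_start = objective(x0)
best, v_final = hooke_jeeves(objective, x0)
print(v_start, v_final)
```
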

Like Kendall's tau, V is obviously not sensitive to the
magnitude of the difference between two d_ij's nor to the size
of their ratio. As mentioned earlier, V has both a clearly
defined minimum of 0.0 and a clearly defined maximum of 1.0.
Both of these properties contrast markedly with the properties
of stress, which is dependent on the ratios between the d_ij's
and does not have a clearly defined maximum of 1.0. Also,
Kendall has proposed a very simple way of dealing with ties.¹⁰
His method can be used in the computation of V, in the event
that certain δ_ij's or certain d_ij's are equal.

⁸In all likelihood, if the Hooke and Jeeves technique will
work with V as the index it will probably work with tau as the
index. In fact, one could probably adopt the Spearman rho as
the goodness of fit measure if he desired to do so.

⁹One set of constraints is operative in this problem. The
d_ij's are not allowed to equal zero. These constraints can be
handled effectively by the insertion of a penalty function into
the Hooke and Jeeves algorithm. This function would
automatically set the value of V equal to 1.0 when a
configuration with one or more zero distances is tested.

If the Hooke and Jeeves technique, or some other algorithm,
will in fact minimize V, the index would appear to have another
desirable property that Kruskal's stress does not clearly
possess. Suppose an investigator were to obtain a mapping of
14 political candidates that minimized V. For interpretation
purposes, he might want to examine the constraints that were
violated. It might happen that a large portion of the
violations (if not all of them) involved a particular stimulus,
candidate number 1, for example. (That is, d_12 is not less
than d_23, as it should be; neither are d_13, d_14, and so forth.)
This kind of result might indicate that the respondent based his
judgments about candidates 2 through 14 on two dimensions (for
example, liberalism and good looks) while randomly making
judgments about candidate 1, or making them on the basis of
something other than the dimensions of liberalism and good looks.
This might be very important to an investigator who is trying
to interpret the dimensions.
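
The examination of violated constraints described above could
be sketched as follows. The sketch assumes that a violation is
a pair of distances ordered against the corresponding
judgments, and it credits each violation to every stimulus
appearing in either distance. The function name and the data
for four stimuli are hypothetical.

```python
from collections import Counter
from itertools import combinations

def violations_by_stimulus(deltas, dists):
    """Tally violations per stimulus; both inputs are dicts keyed by (i, j)."""
    tally = Counter()
    for p, q in combinations(sorted(deltas), 2):
        if (deltas[p] - deltas[q]) * (dists[p] - dists[q]) < 0:
            # Credit the violation to every stimulus in either pair.
            for s in set(p) | set(q):
                tally[s] += 1
    return tally

# Hypothetical judgments (ranks) for the six pairs of four stimuli.
deltas = {(1, 2): 1, (1, 3): 2, (1, 4): 3, (2, 3): 4, (2, 4): 5, (3, 4): 6}
# Hypothetical distances: only those involving stimulus 1 are out of order.
dists = {(1, 2): 3.0, (1, 3): 2.0, (1, 4): 1.0,
         (2, 3): 4.0, (2, 4): 5.0, (3, 4): 6.0}

tally = violations_by_stimulus(deltas, dists)
print(tally.most_common())
```

In this constructed case stimulus 1 takes part in all three
violations, while each of the other stimuli takes part in two.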


¹⁰Kendall, op. cit., pp. 34-48.

One can imagine instances when there might be a number
of different mappings that will yield the same minimum V
value, but with different constraints being violated. It would
appear to be possible to eliminate at least some, if not all,
of these mappings from consideration by choosing the one(s)
with the smallest number of stimuli involved in violations. In
fact, it may even be possible to insert this criterion into
the minimization problem.
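
One way this secondary criterion might be expressed is as a
lexicographic comparison: first V, then the number of distinct
stimuli involved in violations. The sketch below assumes V is
the proportion of violated order constraints; the names and the
two candidate mappings are hypothetical.

```python
from itertools import combinations

def summarize(deltas, dists):
    """Return (V, number of distinct stimuli involved in any violation)."""
    pairs = sorted(deltas)
    involved, bad = set(), 0
    for p, q in combinations(pairs, 2):
        if (deltas[p] - deltas[q]) * (dists[p] - dists[q]) < 0:
            bad += 1
            involved |= set(p) | set(q)
    m = len(pairs)
    return bad / (m * (m - 1) / 2), len(involved)

def pick(deltas, candidates):
    """Choose the mapping with the smallest V, then the fewest stimuli involved."""
    return min(candidates, key=lambda name: summarize(deltas, candidates[name]))

# Hypothetical judgments (ranks) for the six pairs of four stimuli.
deltas = {(1, 2): 1, (3, 4): 2, (1, 3): 3, (1, 4): 4, (2, 3): 5, (2, 4): 6}
# Two hypothetical sets of distances, each with exactly one violation.
candidates = {
    "A": {(1, 2): 1.0, (3, 4): 2.0, (1, 3): 4.0,
          (1, 4): 3.0, (2, 3): 5.0, (2, 4): 6.0},
    "B": {(1, 2): 2.0, (3, 4): 1.0, (1, 3): 3.0,
          (1, 4): 4.0, (2, 3): 5.0, (2, 4): 6.0},
}

print({name: summarize(deltas, d) for name, d in candidates.items()})
print(pick(deltas, candidates))
```

Both mappings have the same V, but the violation in A involves
three stimuli and the violation in B involves four, so A would
be retained under this criterion.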

Now that a new index of goodness of fit when the δ_ij's
are ordinal measures has been derived and discussed, it should
be clear what kind of index ought to be used when the δ_ij's
are interval measures. The easiest index to use would probably
be the Pearson r. The problem would be one of seeking the
maximum r between the δ_ij's and the d_ij's, or the minimum
negative of r. Again, the minimization would be an
unconstrained one. A direct search technique could again be used.
The "optimum gradient" method or one of the other gradient
techniques discussed in Spang (1962) might be more efficient than
the Hooke and Jeeves technique in this case, however. Since
r is continuous, the continuity problems inherent in the use
of V or tau do not arise in this case.
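
The objective function for the interval case might be sketched
as follows. It computes the negative of the Pearson r between
the judgments and the distances implied by a set of
coordinates, so that any minimization routine could be applied
to it. The function names and the data are hypothetical, and
the sketch does not guard against the degenerate case in which
all distances are equal.

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def neg_r(deltas, coords):
    """Negative of r between the deltas and the interpoint distances."""
    d = [math.dist(p, q) for p, q in combinations(coords, 2)]
    return -pearson_r(deltas, d)

coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]   # three points on a line
exact = [2.0, 6.0, 4.0]     # judgments exactly proportional to the distances
rough = [1.0, 2.0, 4.0]     # judgments only loosely related to the distances

print(neg_r(exact, coords), neg_r(rough, coords))
```
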
SUMMARY

The formulation of the new measure of goodness of fit,
V, and the discussion of the use of the Pearson r for interval
data complete the line of argument followed in this paper. In
the first part, multidimensional scaling was defined. The
ordinal or, under certain conditions, interval nature of the
data which are used as input into most multidimensional scaling
techniques was discussed. Finally, one approach to
multidimensional scaling, that of Joseph B. Kruskal, was discussed
in detail. The second part highlighted three problems with
Kruskal's measure of goodness of fit. These were the problems
of interpretation, the lack of a well-defined maximum possible
value and appropriateness of the index. Most of the discussion
in the third and final part was concerned with a new measure of
goodness of fit when the data are measurements on an ordinal
scale. The relationship between V and Kendall's tau showed that
V is essentially a measure of the rank correlation between the
distances, d_ij's, implicit in a particular mapping of the
stimuli, and the psychological distances or δ_ij's which an
investigator obtains from a respondent. A method of minimizing
V was suggested in this part and the problem of continuity was
discussed. Among the desirable properties which V possesses
are ease of interpretation, clearly defined maximum and minimum
possible values and insensitivity to properties of the d_ij's
which the δ_ij's do not possess. Finally, a straightforward
extension of the use of the rank correlation between the d_ij's
and δ_ij's was proposed for the case when the δ_ij's are
measurements on an interval scale, namely the Pearson r or
linear correlation coefficient. The next task to be performed,
in a subsequent analysis, is, of course, the programming of a
technique for minimizing V. After a routine is implemented,
the output should be systematically compared to output from
Kruskal's routine. Once this task has been completed, the
distribution of V under various conditions should be examined.
As David Klahr has pointed out (Klahr, 1969), this
type of analysis alone will allow the investigator to make
probability statements about the goodness of fit value which
he has obtained.